38 research outputs found

    Alpha MAML: Adaptive Model-Agnostic Meta-Learning

    Full text link
    Model-agnostic meta-learning (MAML) is a meta-learning technique to train a model on a multitude of learning tasks in a way that primes the model for few-shot learning of new tasks. The MAML algorithm performs well on few-shot learning problems in classification, regression, and fine-tuning of policy gradients in reinforcement learning, but comes with the need for costly hyperparameter tuning for training stability. We address this shortcoming by introducing an extension to MAML, called Alpha MAML, to incorporate an online hyperparameter adaptation scheme that eliminates the need to tune meta-learning and learning rates. Our results with the Omniglot database demonstrate a substantial reduction in the need to tune MAML training hyperparameters and improvement to training stability with less sensitivity to hyperparameter choice.Comment: 6th ICML Workshop on Automated Machine Learning (2019

    Domain invariant representation learning with domain density transformations

    Get PDF
    Domain generalization refers to the problem where we aim to train a model on data from a set of source domains so that the model can generalize to unseen target domains. Naively training a model on the aggregate set of data (pooled from all source domains) has been shown to perform suboptimally, since the information learned by that model might be domain-specific and generalize imperfectly to target domains. To tackle this problem, a predominant approach is to find and learn some domain-invariant information in order to use it for the prediction task. In this paper, we propose a theoretically grounded method to learn a domain-invariant representation by enforcing the representation network to be invariant under all transformation functions among domains. We also show how to use generative adversarial networks to learn such domain transformations to implement our method in practice. We demonstrate the effectiveness of our method on several widely used datasets for the domain generalization problem, on all of which we achieve competitive results with state-of-the-art models

    High-Cadence Thermospheric Density Estimation enabled by Machine Learning on Solar Imagery

    Full text link
    Accurate estimation of thermospheric density is critical for precise modeling of satellite drag forces in low Earth orbit (LEO). Improving this estimation is crucial to tasks such as state estimation, collision avoidance, and re-entry calculations. The largest source of uncertainty in determining thermospheric density is modeling the effects of space weather driven by solar and geomagnetic activity. Current operational models rely on ground-based proxy indices which imperfectly correlate with the complexity of solar outputs and geomagnetic responses. In this work, we directly incorporate NASA's Solar Dynamics Observatory (SDO) extreme ultraviolet (EUV) spectral images into a neural thermospheric density model to determine whether the predictive performance of the model is increased by using space-based EUV imagery data instead of, or in addition to, the ground-based proxy indices. We demonstrate that EUV imagery can enable predictions with much higher temporal resolution and replace ground-based proxies while significantly increasing performance relative to current operational models. Our method paves the way for assimilating EUV image data into operational thermospheric density forecasting models for use in LEO satellite navigation processes.Comment: Accepted at the Machine Learning and the Physical Sciences workshop, NeurIPS 202

    KL guided domain adaptation

    Get PDF
    Domain adaptation is an important problem and often needed for real-world applications. In this problem, instead of i.i.d. training and testing datapoints, we assume that the source (training) data and the target (testing) data have different distributions. With that setting, the empirical risk minimization training procedure often does not perform well, since it does not account for the change in the distribution. A common approach in the domain adaptation literature is to learn a representation of the input that has the same (marginal) distribution over the source and the target domain. However, these approaches often require additional networks and/or optimizing an adversarial (minimax) objective, which can be very expensive or unstable in practice. To improve upon these marginal alignment techniques, in this paper, we first derive a generalization bound for the target loss based on the training loss and the reverse Kullback-Leibler (KL) divergence between the source and the target representation distributions. Based on this bound, we derive an algorithm that minimizes the KL term to obtain a better generalization to the target domain. We show that with a probabilistic representation network, the KL term can be estimated efficiently via minibatch samples without any additional network or a minimax objective. This leads to a theoretically sound alignment method which is also very efficient and stable in practice. Experimental results also suggest that our method outperforms other representation-alignment approaches

    KL Guided Domain Adaptation

    Get PDF
    Domain adaptation is an important problem and often needed for real-world applications. In this problem, instead of i.i.d. datapoints, we assume that the source (training) data and the target (testing) data have different distributions. With that setting, the empirical risk minimization training procedure often does not perform well, since it does not account for the change in the distribution. A common approach in the domain adaptation literature is to learn a representation of the input that has the same distributions over the source and the target domain. However, these approaches often require additional networks and/or optimizing an adversarial (minimax) objective, which can be very expensive or unstable in practice. To tackle this problem, we first derive a generalization bound for the target loss based on the training loss and the reverse Kullback-Leibler (KL) divergence between the source and the target representation distributions. Based on this bound, we derive an algorithm that minimizes the KL term to obtain a better generalization to the target domain. We show that with a probabilistic representation network, the KL term can be estimated efficiently via minibatch samples without any additional network or a minimax objective. This leads to a theoretically sound alignment method which is also very efficient and stable in practice. Experimental results also suggest that our method outperforms other representation-alignment approaches
    corecore